ABSTRACT A plasmid was constructed that facilitates the
cloning and expression of open reading frame DNA. A DNA fragment
containing a bacterial promoter and the amino terminus of
the cI gene of bacteriophage λ was fused to an amino-terminally
deleted version of the lacZ gene. An appropriate cloning site was
inserted between these two fragments such that a frameshift mutation
was introduced upstream of the lacZ-encoding DNA. This
cloning vehicle produces a relatively low level of β-galactosidase
activity when introduced into Escherichia coli. The insertion of
foreign DNA at the cloning site can reverse the frameshift mutation
and generate plasmids that produce a relatively high level
of β-galactosidase activity. A large fraction of these plasmids
produce a fusion protein that has a portion of the λ cI protein at the
amino terminus, the foreign protein segment in the middle, and
the lacZ polypeptide at the carboxyl terminus. The production of
a high level of β-galactosidase and a large fusion polypeptide
guarantees the cloning of a DNA fragment with at least one open
reading frame that traverses the entirety of the fragment. Hence, the
method can identify, clone, and express (as part of a larger fusion
polypeptide) open reading frame DNA from among a large
collection of DNA fragments.

The use of recombinant DNA technology has increased our 
understanding of the organization and expression of eukaryotic 
DNA. While many goals can be achieved with existing tech- 
nology, certain features of eukaryotic gene organization are still 
attainable only with difficulty. In particular, the identification 
of protein-coding segments within a large block of eukaryotic 
DNA is quite challenging. One should be able to take advantage 
of properties that are unique or highly enriched in coding DNA. 
Perhaps the most ubiquitous and specific feature of coding DNA 
is the presence of open reading frames. Indeed, analysis at the 
DNA sequence level has proved invaluable in identifying genes 
or gene segments. Although there are many examples of genes 
in which relatively small exons are interrupted by relatively 
large and numerous introns, this characteristic appears to be 
relatively infrequent in lower eukaryotes. Also, higher eukary- 
otic genes have been analyzed that contain open reading frames 
of substantial length. 

In this report, we describe a method for cloning pieces of 
DNA based on this property. We have exploited the elegant 
work of Beckwith and co-workers (1), who have demonstrated 
the power and utility of gene and operon fusions to the
Escherichia coli β-galactosidase gene (the lacZ gene). Many of these
studies rely on the fact that the enzyme β-galactosidase
(β-D-galactoside galactohydrolase, EC 3.2.1.23) is often biologically
active when it carries other protein segments at its amino
terminus. Indeed, it appears that in yeast, as well as in E. coli,
β-galactosidase can carry a wide range of protein sequences at its
amino terminus and still retain substantial biological activity
(2, 3). We have taken advantage of this property to create a lacZ
vector for the cloning of open reading frames. We have
introduced a frameshift mutation by inserting a DNA linker at the
junction of a cI-lacZ fused gene. As expected, this frameshift
results in a relatively low level of lacZ activity. The DNA linker
encodes a cloning site. Consequently, the vector is able to 
screen for segments of open reading frame DNA, since a con- 
tinuous open reading frame sequence is a necessary condition 
for high-level expression of the lacZ sequence downstream from 
the insert. A successful construction results in the synthesis of 
a lacZ fusion polypeptide that includes the protein segment 
coded for by the inserted open reading frame. The vector is able 
to identify, clone, and express open reading frame DNA in one 
step. 

METHODS AND MATERIALS 

Materials. Restriction enzymes, T4 DNA ligase, and BAL- 
31 nuclease were purchased from New England BioLabs and 
Bethesda Research Laboratories. Calf intestine phosphatase 
was purchased from Boehringer-Mannheim;
5-bromo-4-chloro-3-indolyl β-D-galactoside (XGal), o-nitrophenyl
β-D-galactopyranoside (ONPG), and isopropyl β-D-thiogalactoside
were purchased from Sigma. XGal plates and MacConkey agar
plates (Difco) were prepared as described by Miller (4). Sma
I/BamHI adapter was purchased from New England BioLabs.

Antibodies. Rabbit anti-β-galactosidase was a gift of Beth
Rasmussen, rabbit anti-cI was a gift of Jeffrey Roberts, and the
I-labeled IgG fraction of a goat anti-rabbit immunoglobulin
Fc serum was a gift of Susan Lowey and Joan Press.

Bacterial Strains. The lac deletion strain LG90 (F⁻ Δlac pro
XIII) was used exclusively (2).

Plasmid DNA Preparation. Plasmid DNA was prepared by 
standard CsCl/ethidium bromide centrifugation methods (5) or 
from minilysates prepared by using the alkaline lysis procedure 
(6). 

Plasmid Constructions. Plasmid DNA was digested with
commercial restriction enzymes as recommended by the
suppliers. Where indicated, restricted DNA was treated with
sufficient phosphatase [10 mM Tris-HCl (pH 8.0), 30 min at 37°C]
to reduce self-ligation of a comparable amount of vector DNA
by a factor of at least 100. DNA was sonicated in 5 ml of TE
buffer (10 mM Tris-HCl, pH 8.3/1 mM EDTA)/0.2 M NaCl
with six 30-sec bursts at maximum power, concentrated on a
0.2-ml column of DEAE-cellulose, eluted with TE buffer/1 M
NaCl, and precipitated with ethanol. DNA was resuspended in
TE buffer, digested lightly with BAL-31 nuclease (sufficient to
remove approximately 25 nucleotides from each end of a similar
quantity of control restriction enzyme-digested DNA
fragments) at 30°C, and fractionated by size on 10% acrylamide/
Tris borate/EDTA (TBE) gels along with appropriate molecular
weight markers (7).

Abbreviations: XGal, 5-bromo-4-chloro-3-indolyl β-D-galactoside; bp,
base pair(s).

Ligations. Cohesive-end ligations were carried out as follows:
100 ng of plasmid vector and various amounts of eukaryotic
DNA were ligated for 3 hr at 22°C in a final volume of 100 µl
with 0.1 (Bethesda Research Laboratories) unit of T4 DNA
ligase. Blunt-end ligations were typically carried out as follows:
500 ng of plasmid vector and various amounts of potential insert
DNA were ligated overnight at 16°C in a final volume of 10 µl
with 1 unit of T4 DNA ligase.

Transformation. Bacterial transformation into LG90 was
carried out by standard procedures (8), with the following
modifications. All ligated DNA samples were phenol extracted and
ethanol precipitated prior to transformation, which caused a
reproducible 5- to 10-fold increase in transformation efficiency.
Transformed cells (1 ml) were spread on the appropriate
indicator plates containing ampicillin at 25 µg/ml. Colonies were
scored for Lac phenotype on MacConkey agar plates after 24 hr at
37°C. If the plates were crowded, the colony phenotype was
scored at 48 hr.

Hybrid Protein Analysis. Three-milliliter samples of cells
from a saturated culture (grown in L broth containing ampicillin
at 50 µg/ml) were centrifuged, the pellet was suspended in 150
µl of concentrated sample buffer (9) and heated at
100°C for 3 min, and the viscosity was reduced by repeated
passages through a 22-gauge syringe needle. Between 4 and 8
µl of the protein samples were electrophoresed in 7.5%
acrylamide gels (9). Protein blotting was carried out according to
published procedures (10).

β-Galactosidase Assays. These were carried out according
to Miller (4) except that cells were grown in L broth. 
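
The text does not spell out how assay readings are converted to activity units. As a rough sketch, assuming the commonly cited form of the Miller calculation (the function name, argument names, and example readings below are hypothetical and are not taken from this report), the conversion could be written as:

```python
def miller_units(od420, od550, od600, minutes, volume_ml):
    """Commonly cited Miller-unit formula (an assumption, not from this report).

    od420/od550: absorbances of the reaction mixture after stopping
    od600: density of the culture at the time of sampling
    minutes: reaction time; volume_ml: culture volume assayed
    """
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("minutes, volume_ml, and od600 must be positive")
    # 1.75 * OD550 corrects the OD420 reading for light scattering by cell debris
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


if __name__ == "__main__":
    # Hypothetical readings, chosen only to illustrate the arithmetic
    print(miller_units(od420=0.9, od550=0.0, od600=0.5, minutes=10, volume_ml=0.1))
```

Whether the activities reported in Table 2 were computed in exactly this way is not stated in the text.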
RESULTS

Our initial hypothesis was that a frameshift mutation placed near
the amino terminus of an open reading frame sequence that
contains the lacZ sequence at its carboxyl terminus would
produce a plasmid that would be phenotypically lacZ⁻ when
transformed into E. coli. Furthermore, the insertion of foreign DNA
at or near the frameshift could correct the frameshift mutation
and generate a lacZ⁺ insert DNA-containing plasmid. The two
constructions shown in Fig. 1A were designed to test this
hypothesis. A segment of the cI gene from bacteriophage λ
terminating at a HindIII site at codon 157 was fused to a lacIZ
fragment. The cI gene fragment also contains two lac promoters that
are oriented toward cI. Both constructions were identical except
that two different lacZ-containing plasmids were used. These
two plasmids, pLG200 and pLG400, differ in DNA sequence
only in the immediate vicinity of the HindIII site (12). Because
the DNA sequence of the λ cI gene is also known (13), as well
as the reading frame of the lacI-lacZ (lacIZ) fusion protein
encoded by both pLG200 and pLG400 (12), we could predict the
reading frame of the two resultant plasmids. Plasmid pMR1 has
24 bases between the lysine codon at amino acid 157 in the cI
gene and the leucine codon at the beginning of the lacI fragment
(Fig. 1A). Since none of these eight codons are stop codons, this
construction reads in frame from the λ cI gene into the lacIZ
fusion and consequently produces a high level of β-galactosidase
activity. In contrast, pMR2, constructed from pLG200, has 10
bases between the AAG lysine codon of the λ cI gene and the
leucine codon of lacI. Because this number of intervening
bases is not divisible by 3, this construction should shift the
frame, resulting in improper reading of or termination in (or
Fig. 1. Construction of vectors. (A) Plasmids pKB252 (11), pLG200 (12),
and pLG400 (12) were manipulated by standard procedures to produce
... Briefly, pKB252 was digested ... pLG200
(or pLG400) was digested with Sma I/HindIII and the largest fragment
... ampicillin-resistant colonies were selected. The restriction sites of
the resultant plasmids were verified. ... Both plasmids, when
... LG90 ... on XGal ... MacConkey agar plates. The DNA sequence
... sequence of the λ cI gene up ... (13) and sequence in the
vicinity of the HindIII site ... pLG200 ... (12). (B) One hundred
nanograms of ... 2 ng of BamHI/Sma I adapter were ligated ...
transformed into LG90, and ampicillin-resistant lac⁻ colonies
were scored on MacConkey agar plates. DNA sequences in the vicinity
of the cloning sites were verified by direct sequence analysis.

both) the lacIZ fusion sequence. This plasmid results in a
relatively low level of β-galactosidase activity. In addition, the
insertion of eukaryotic DNA into the HindIII site of pMR2 is able
to generate a high level of lacZ activity (data not shown).

To construct a vector of more utility, pMR1 was manipulated
as shown in Fig. 1B. This construction inserted 10 base pairs (bp) at
the BamHI site and therefore changed the reading frame so that
the lacZ sequence is no longer properly translated. Moreover,
it has created a blunt-ended Sma I site, flanked by two BamHI
sites, within this frameshifted region. The Sma I site provides
a cloning site into which blunt-ended DNA can be cloned.
Insert DNA can be excised by digestion with BamHI.

To verify that this second-generation vector was suitable for
cloning open reading frame DNA, Hae III-digested pBR322 was
cloned into the Sma I site of pMR100. After transformation into
LG90, 28 red colonies were chosen at random and analyzed for
plasmid DNA insert size. All 28 clones yielded an additional
BamHI fragment when analyzed by acrylamide gel
electrophoresis; 8 of these clones are shown in Fig. 2. A very limited subset
of the possible pBR322 Hae III fragments are cloned when
ampicillin-resistant colonies are chosen on the basis of a strong
lacZ⁺ phenotype. As expected, most white colonies from this
ligation contain plasmid DNA with no detectable insert, since
most of these colonies are presumably derived from
recyclization of the vector (see Table 1). White colonies that harbor a
plasmid with a DNA insert do not show the dramatic insert size
preference seen with red colonies; DNA fragments of many
different sizes are obtained from white colonies (data not
shown).

The sequence of pMR100 predicts that two criteria must be
met for a DNA fragment inserted into the Sma I site to generate
a proper open reading frame between the λ cI fragment and the
lacIZ fragment: (i) the inserted DNA must be of a size 3n -
1 (where n is an integer) and (ii) the inserted DNA must contain
no stop codons in the reading frame set by the frame of the λ
cI gene. Examination of the size and sequence (in both
orientations) of the pBR322 Hae III fragments predicts that, of the
22 Hae III fragments, only 3 meet the two criteria: a 104-bp
fragment from the tetracycline-resistance gene (which contains
the single pBR322 BamHI site), an 89-bp fragment, and an
80-bp fragment, the latter two from the ampicillin-resistance gene.
The 89-bp fragment has an appropriate open reading frame in
both orientations, while the 104-bp fragment is cleaved by
BamHI into two fragments, a 78- and a 26-bp fragment.
Consequently, the sequence data predict that 4 fragment orienta-
Fig. 2. Cloning of Hae III fragments of pBR322 into pMR100.
pBR322 DNA was digested to completion with Hae III and the
resultant fragments digested with phosphatase (to prevent multiple
inserts). The phosphatase-digested DNA was ligated with pMR100 and
transformed into LG90. Red colonies were picked, and DNA was
isolated from minilysates and digested with BamHI. DNA was analyzed
on 10% acrylamide/Tris borate/EDTA gels. Lanes: 1 and 6, Hae III
pBR322 fragment standards; 2-5 and 7-10, DNA from random red
colonies. Arrows indicate the 104-bp and 89-bp Hae III pBR322
fragments.
Table 1. Cloning of pBR322 Hae III fragments into pMR100
and pMR200

Ligation                 Vector/pBR322        Colonies
mixture     Vector       ratio*            Red      White
1           pMR200†      1:1               349      13
2           pMR200       1:5               300      65
3           pMR100       1:5               11       594

Phosphatase-digested pBR322 Hae III fragments were ligated with
Sma I-digested pMR100 or Sma I-digested pMR200. Ligated DNA was
transformed into LG90 and colonies were scored on MacConkey agar
plates. Because pMR200 is a lacZ⁺ vector having the Sma I cloning site
at the same position as pMR100, the fraction of white colonies ≈ the
fraction of religated vector that contains an insert. This value was
≈20% in ligation mixture 2 but, in ligation mixture 3 under identical
conditions, 11/605 colonies are red. Normalizing for the expected
fraction of colonies from this ligation that contain an insert (≈20%, the same
value as ligation mixture 2), the fraction of pBR322 Hae III fragments
that can reverse the frameshift ≈ (11/605)/0.2 ≈ 10%.
* Arbitrary.

† pMR200 was derived from pMR100 as a rare lacZ⁺ colony (on
MacConkey agar) after digestion of pMR100 with Sma I,
self-ligation, and transformation. It is almost certainly a single-base-pair
deletion of pMR100, placing the reading frame back in register. [The
Sma I site (Fig. 1) is retained despite the deletion.] The size of the
lacZ polypeptide is identical to that produced by pMR1 and all
restriction sites in pMR100 are still present. Since almost all inserts
into the Sma I site are lacZ⁻, the vector can be used to measure insert
frequency.

tions out of a possible 44 (22 fragments × two orientations)
should produce red colonies. This value of 10% is consistent
with experiment (Table 1). Moreover, there should be two size
classes of insert fragments produced by BamHI digestion, 103
and about 93 nucleotides (89- and 78- to 80-nucleotide
fragments plus 14 nucleotides from the DNA between the BamHI/
Sma I sites shown in Fig. 1). Consistent with expectation, 26 of
the 28 red colonies examined at random have one of the two size
classes of DNA insert shown in Fig. 2. Moreover, double
digestion of plasmids derived from these red colonies confirms that
some of the larger size DNA inserts are indeed derived from
the 89-bp Hae III fragment, which has been cloned in both
orientations.
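
The two criteria for a frame-restoring insert, and the arithmetic behind the 10% figure, can be sketched in a few lines of Python. This is only an illustration: the function names and the `offset` parameter (how many insert bases complete a codon begun in the vector) are hypothetical, the linker sequence is not modeled, and codons spanning the vector/insert junctions are ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def restores_frame(insert, offset=0):
    """Check criteria (i) and (ii) for one orientation of an insert.

    offset is a hypothetical parameter (0, 1, or 2): the number of insert
    bases that finish a codon started in the vector. Junction codons are
    not checked because the flanking sequence is not modeled here.
    """
    if offset not in (0, 1, 2):
        raise ValueError("offset must be 0, 1, or 2")
    seq = insert.upper()
    if len(seq) % 3 != 2:  # criterion (i): length must be 3n - 1
        return False
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return not any(c in STOP_CODONS for c in codons)  # criterion (ii)


def productive_orientations(insert, offset=0):
    """List the orientations of a blunt-ended insert that pass both criteria."""
    candidates = (("forward", insert), ("reverse", reverse_complement(insert)))
    return [name for name, s in candidates if restores_frame(s, offset)]


# Numbers taken from the text and Table 1
predicted_fraction = 4 / 44            # 4 of 22 fragments x 2 orientations
observed_fraction = (11 / 605) / 0.20  # red fraction / fraction with insert
excised_sizes = [size + 14 for size in (89, 78, 80)]  # insert + linker DNA

if __name__ == "__main__":
    print(round(predicted_fraction, 3), round(observed_fraction, 3))
    print(excised_sizes)
```

Under these simplifications the predicted and observed fractions both come out near 0.09, in line with the 10% quoted above, and the 104-, 89-, and 80-bp fragment lengths all satisfy the 3n - 1 rule.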

The above data suggest that the vector pMR100 can select
suitable open reading frame fragments from among a larger
group of DNA segments. Because of the nature of the cloning
vehicle, a productive DNA insert should be expressed as a part
of a larger fusion polypeptide. To verify this prediction, we
cloned into pMR100 sonicated DNA (carefully size fractionated
to avoid cloning small DNA fragments) derived from a cloned
piece of Drosophila melanogaster DNA. Eleven red colonies
were chosen for detailed analysis; the data are summarized in
Table 2. Five of the 11 colonies contain proteins that are bigger
(by approximately 20,000 daltons) than the initial cI-lacIZ fusion
protein, consistent with a DNA insert size of approximately 500
bp (Table 2 and Fig. 3). These large proteins react with both
anti-cI and anti-β-galactosidase antisera (Table 2 and Fig. 4),
consistent with the hypothesis that they are proteins of the
general structure cI-eukaryotic piece-lacIZ. These five strains
contain additional smaller proteins, the largest of which is
approximately 20,000 daltons less than the initial cI-lacZ fusion protein
of pMR200. This protein reacts positively with
anti-β-galactosidase antiserum but does not react with anti-cI antiserum. A
protein with similar properties is also visible in the strain with
pMR200. Two of the colonies (nos. 2 and 4) have properties
suggesting that they are lacZ⁺ revertants of pMR100 generated
by a deletion of perhaps 1 bp during cloning. Four of the
colonies (nos. 7, 9, 10, and 11) contain only the smaller proteins
Table 2. Insertion of cloned sonicated Drosophila DNA
into pMR100

           Insert      β-Gal                          Protein blot
Clone      size, bp    activity    Protein*       Anti-cI    Anti-β-Gal
1          425         1,464       Small, big†    Big        Small, big
2          —           ND          pMR200         ND         ND
3          515         1,547       Small, big     Big        Small, big
4          —           ND          pMR200         ND         ND
5          425         1,569       Small, big     Big        Small, big
6          520         1,685       Small, big     Big        Small, big
7          460         660         Small          —          Small
8          440         795         Small, big     Big        Small, big
9          400         762         Small          —          Small
10         530         441         Small          —          Small
11         400         351         Small          —          Small
LG90       —           2           —              ND         ND
pMR100     —           30          —              —          —
pMR200     —           10,661      pMR200         Big        Small, big

The cloned Drosophila melanogaster DNA was a generous gift of
Welcome Bender. This eukaryotic insert, an 8.1-kilobase (kb) Sal I
fragment in the Sal I site of pBR322, has been identified on the basis
of genetic criteria to include the ry+ gene, which codes for xanthine
dehydrogenase (Welcome Bender and Arthur Chovnick, personal
communication). The 8.1-kb Sal I insert was purified by agarose gel
electrophoresis and electroelution and then sonicated. DNAs of 400-550
bp were selected for cloning, and 11 red colonies were chosen for
subsequent analysis. No colonies or data were discarded from the analysis.
β-Gal, β-galactosidase; —, no detectable insert or protein band; ND,
not done.

* Protein size refers to the prominent β-galactosidase polypeptide(s)
visible on NaDodSO4 gels as assayed by Coomassie blue staining and
blotting with anti-β-galactosidase antibody.

† "Big" refers to a β-galactosidase polypeptide 15,000-20,000 daltons
larger than the cI-lacIZ fusion in pMR200. "Small" refers to a
β-galactosidase polypeptide 15,000-20,000 daltons smaller than the
cI-lacIZ fusion in pMR200; none of these "small" proteins reacts with
anti-cI antibody.

(no. 7 is shown in Figs. 3 and 4). The properties of these poly- 
peptides are identical to those of the small protein of the other 
clones described above. The origin of this small protein is un- 
certain (see Discussion). 
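
The correspondence between insert length and the size increase of the fusion protein can be checked with a short calculation. The sketch below assumes an average residue mass of about 110 daltons, a common rule of thumb that is not stated in the text; the function and constant names are hypothetical.

```python
AVG_RESIDUE_DALTONS = 110.0  # assumed average mass per amino acid residue


def added_mass_daltons(insert_bp, avg_residue=AVG_RESIDUE_DALTONS):
    """Approximate mass added to a fusion protein by an in-frame insert."""
    codons = insert_bp / 3  # each codon contributes one residue
    return codons * avg_residue


if __name__ == "__main__":
    # Insert sizes spanning the 400-550 bp range selected for cloning
    for bp in (400, 500, 550):
        print(bp, round(added_mass_daltons(bp)))
```

Under this assumption a 500-bp insert adds roughly 18,000 daltons, which is in keeping with the increase of approximately 20,000 daltons observed for the large fusion proteins.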
Fig. 3. Gel analysis of hybrid proteins. Proteins from various
sources were analyzed by NaDodSO4 gel electrophoresis and the gel
was stained with Coomassie blue. Lanes: 1, HB101; 2, strain HB101
grown overnight at 37°C in the presence of 1 mM isopropyl
β-D-thiogalactoside; 3, LG90; 4, pMR100; 5, pMR200; 6-11, clones 1, 3, and
5-8, respectively (see Table 2). Arrows: lower, wild-type lacZ (lane 2);
middle, cI-lacIZ fusion protein (lane 5); upper, large insert-derived
fusion proteins (lanes 6-9 and 11).

- 1) and entirely open reading frame $\approx (1/3) \times (0.953)^{x/3} = 5.5
\times 10^{-4}$ for 400-bp DNA and $1.1 \times 10^{-4}$ for 500-bp DNA. The
probability that a segment of DNA that is derived from a larger
piece of open reading frame DNA is of the correct length and
entirely open reading frame = the probability that it is the
correct length (1/3) × the probability that it is in the correct frame
(1/3) × the probability that it has cloned in the correct
orientation (1/2) $\approx 5.5 \times 10^{-2}$. The difference between these
numbers suggests that an open reading frame clone from a 400- to
500-bp DNA insert can be considered with some confidence to
DISCUSSION

The experiments presented here validate our initial hypothesis:
a plasmid vector having a frameshift near the amino terminus
of a gene that codes for a polypeptide with lacZ⁺ activity can
be used to identify, clone, and express open reading frame
segments of DNA. Although other methods are available to carry
out these functions, they usually require detailed information, 
often at the DNA sequence level, about the gene or genes of 
interest. The value of the vector and approach described here 
is that no information need be known to clone and express frag- 
ments of open reading frame DNA from among a larger number 
of fragments. 

The data presented have been chosen from a large number
of similar experiments to illustrate some important features of
the methodology. When shotgun-cloning random fragments of
DNA, it is important to size the DNA carefully prior to ligation.
Because the frequency of random open reading frames increases
in inverse proportion to DNA fragment size, the selection of red
colonies will also select for small DNA inserts, even if the
average DNA size is quite large (data not shown). In the absence
of any information about the DNA to be cloned in this way, we
have chosen a DNA insert size of 400-500 bp. The probability
that a random piece of DNA of length x is the correct size (3n


Fig. 4. Blot analysis of hybrid proteins. Proteins from various
sources were analyzed by protein blotting and antibody staining with
anti-cI antibody (A) and anti-β-galactosidase antibody (B). Lanes: 1,
pMR100; 2, pMR200; 3-9, clones 1 and 3-8, respectively (see Table 2).
The arrow indicates the small protein discussed in the text. The
additional smaller protein bands present in all lanes (except lane 1) are
probably fusion protein degradation products that lack discrete
portions from the carboxyl terminus, since these bands are visible with
anti-cI antibody.
be a bona fide open reading frame in the DNA from which it
was derived. This level of confidence decreases considerably 
as a function of insert size. Such an argument suggests that genes 
with small exons and large introns are not generally approach- 
able in this way although sonicated double-stranded cDNA 
could be cloned in a similar fashion. 
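
The probability estimates used in this argument can be reproduced numerically. The sketch below assumes that the factor 0.953 is the fraction of non-stop codons (61 of 64) in random sequence with equal base frequencies and independent codons; that interpretation, and the names used in the code, are assumptions made for illustration.

```python
P_NON_STOP = 61 / 64  # assumed origin of the 0.953 factor (61 sense codons of 64)


def p_random_open_frame(length_bp):
    """Chance that random DNA is 3n - 1 long and open in the fixed frame."""
    p_correct_length = 1 / 3
    return p_correct_length * P_NON_STOP ** (length_bp / 3)


def p_coding_segment():
    """Chance that a piece cut from a long open reading frame scores as open:
    correct length (1/3) x correct frame (1/3) x correct orientation (1/2)."""
    return (1 / 3) * (1 / 3) * (1 / 2)


if __name__ == "__main__":
    for bp in (400, 500):
        print(bp, f"{p_random_open_frame(bp):.2e}")
    print(f"{p_coding_segment():.2e}")
    # Enrichment of genuine coding segments over random DNA at 400 bp
    print(round(p_coding_segment() / p_random_open_frame(400)))
```

With these assumptions, a piece of genuine coding DNA is about 100 times more likely than random 400-bp DNA to give an open frame, and the ratio grows with insert length because the random-DNA term falls off exponentially.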

The data in Table 1 argue that the strategy enriches signifi- 
cantly for open reading frame DNA segments. On the other 
hand, some false positives are certainly generated, as shown in 
Table 2 and Figs. 3 and 4. Some red colonies contain plasmid 
DNA with no detectable DNA insert and a lacZ polypeptide of 
a molecular weight identical to the initial cI-lacIZ fusion
protein. As indicated above, these colonies are probably due to a
small amount of exonuclease activity present during Sma I
digestion or ligation. This source of red colonies is more
problematic when the fraction of bona fide open reading frame
colonies is low [e.g., when cloning genomic DNA from higher
eukaryotes (unpublished experiments)]. There are also red
colonies with proper DNA inserts but no detectable large fusion
proteins. We have generated colonies of this phenotype from
many sources of eukaryotic DNA (data not shown). They are
always associated with the presence of a small lacZ polypeptide
of molecular weight similar or identical to the lacIZ-encoded
portion of the fusion protein. Because a similar lacZ polypeptide
is visible in protein from pMR200 and all of these small
polypeptides fail to react with anti-cI antibody, we favor the
interpretation that they are due to proteolysis of the larger
cI-positive fusion protein. Presumably, this occurs to a variable extent
with different fusion proteins. Alternatively, these small
proteins from the insert DNA-containing plasmids may be due to
translational restarts near the insert-lacIZ junction; this matter
might be clarified by determination of the DNA sequence of
several inserts from plasmids of this type.

Despite these two types of fusion proteins, a substantial frac- 
tion of red colonies contain bona fide open reading frame inserts 
as determined by the nature of the fusion protein they generate 
(Fig. 4). We expect this general approach to prove useful for a 
number of purposes. In addition to those described here, many 
of the fusion proteins generated in this or in a similar way may 
be antigenic and immunogenic. Large numbers of lac⁺ colonies
could be selected on lactose minimal plates and screened with
immunological reagents. Thus, it should be feasible to clone
specific genes on this basis. The methodology should also serve
as a rapid and convenient way to proceed in the opposite
direction [i.e., from a gene (or a subgene region) to an antibody
reagent directed against a portion of the proteins coded for by
the starting DNA].

We are grateful to the individuals who supplied reagents. Also, we
thank Drs. Donald Winkelmann and L. E. Cannon for advice about
protein blotting and M. A. Osley and L. M. Hereford for helpful
discussions. This work was supported by National Institutes of Health
Grant HD-08887 (to M.R.). M.R.G. was supported by a predoctoral
Kessner Fellowship.

Note Added in Proof. The large proteins from clones 5 and 6 react
positively with anti-xanthine dehydrogenase antiserum (a gift of Arthur
Chovnick), showing that the eukaryotic protein pieces are antigenic.